Abstract

The advent of social media has provided data and insights
about how people relate to information and culture. While
information is composed of bits and its fundamental building
blocks are relatively well understood, the same cannot be said
for culture. The fundamental cultural unit has been defined
as a "meme". Memes are defined in the literature as specific
fundamental cultural traits that float together in their
environment. Just like genes carried by bodies, memes are carried
by cultural manifestations like songs, buildings or pictures.
Memes are studied in their competition for being successfully 
passed from one generation of minds to another, in differ- 
ent ways. In this paper we choose an empirical approach to 
the study of memes. We downloaded data about memes from 
a well-known website hosting hundreds of different memes 
and thousands of their implementations. From this data, we 
empirically describe the behavior of these memes. We sta- 
tistically describe meme occurrences in our dataset and we 
delineate their fundamental traits, along with those traits that 
make them more or less apt to be successful. 



1 Introduction 

Social media are virtual communities present on the Web
that allow people to create, share, exchange and comment on
pieces of content among themselves. Social media can be of
different types: social networks, whose main aim is to allow
people to keep in touch with hundreds of friends, like
Facebook or Google+; social bookmarking websites, which allow
users to share links to interesting content present on the Web,
like Reddit or Digg; blogging platforms, where the user is
placed at the center of the content creation process, like
Wordpress or Blogger; and many more. The defining
characteristic of social media is many-to-many communication:
the users are at the same time producers and consumers of
information and knowledge.

In social media, a popular concept is that of the "Internet
Meme". A "Meme" is defined as the simplest cultural unit
that can spread from one mind to another (Dawkins 1976).
A particular tune or a given rhetorical figure are examples of
memes. An internet meme is a meme that spreads through
the internet. Internet memes carry an additional property that
ordinary memes do not. While preserving all the
characteristics of ordinary memes, due to being spread through the
internet, internet memes leave a footprint that is traceable
and analyzable. Memes spreading through other media are
not so easily traceable. For this reason, several researchers
have already studied internet memes in social media like Twitter
(Wei et al. 2012) or the blogosphere (Leskovec, Backstrom,
and Kleinberg 2009).

However, most studies of internet memes share a common 
starting point: they observe the usage of memes by users 
who are interacting in a network. The main focus is on the 
interactions between the users and the influence of the topol- 
ogy of the network itself in the meme spreading process. In 
other words, these studies are not inquiring about the char- 
acteristics and the dynamics of memes per se, but the char- 
acteristics of the environment in which memes live, i.e. the 
social network that lies underneath social media. A meme is 
studied only in terms of its reaction to this environment. For 
this reason, we say that these works are studying the
ecology of internet memes. We chose an alternative approach
by focusing on the description of the characteristics and the
behavior of some internet memes, independently from the
networks they live in, as in (Bauckhage 2011).

In this paper we focus mainly on the analysis of internet
meme data from Quickmeme. Quickmeme is a website
mainly used by social bookmarking users to create memes
and share them on a social bookmarking website (Quickmeme
was created by Reddit users as a platform on which
to create memes and share them on Reddit itself). Quickmeme is
a useful instrument to track memes because it registers without
any ambiguity the meme used, its rating and the moment in
time at which the meme was created. We are aware that
Quickmeme does not contain most internet memes, nor all
kinds of internet memes. However, by focusing on this easily
traceable source we are able to study a portion of internet
memes in a controlled environment. This allows for higher
quality results, which may later be generalized to internet
memes at large.

The aim of the paper is twofold, focused on meme
competition and meme success. First, we want to prove that the
memes we are tracking via the Quickmeme website have the
fundamental characteristics to be called "memes". To do so,
we need to prove that they are interacting, as variation and
heredity are noted to be two fundamental characteristics of
memes (Dawkins 1976). Their interactions can take the form
of competition and collaboration: memes compete for
the attention of the users and, in doing so, they can also
cooperate, resulting in higher success ratios.

Second, we want to study the characteristics that make 
them successful memes in Quickmeme. Given some quan- 
titative characteristics of the memes, we want to understand 
how these characteristics are influencing their chances of be- 
ing successful memes. The difference with the rest of the 
internet meme literature is that we are not considering any 
social or network effect when studying meme success. The 
typical question of those studies breaks down to: "what are 
the characteristics of a social network which maximize in- 
formation spreading?". Instead, we think that the success of 
a meme may be influenced by where it appears for the first 
time, but ultimately it also has to have some characteristics 
that make it more apt to survive. 

To sum up, this paper makes the following contributions: 

• We provide a study of the intrinsic characteristics of
some memes, without focusing on the network effect behind
them, thus providing useful insights to better understand
cultural dynamics, instead of social dynamics;

• We provide insights about a novel source of data for 
memetics studies, by being able to detect and study 
memes in social media, using a data source like Quick- 
meme that provides high quality data about meme popu- 
larity; 

• We are able to detect competition and collaboration in the 
meme pool of Quickmeme, and to study the characteris- 
tics of successful memes present in the website. 

The remainder of the paper is organized as follows.
In Section 2 we review the related literature on internet
memes, information spread and memetics. We present our
data source and the data cleaning process in Section 3.
Section 4 contains the Quickmeme analysis methodology and
the evidence for meme relationships (to detect competition
and collaboration) in our data source. In Section 5 we try
to define the characteristics that lead to meme success in our
dataset. Section 6 concludes the paper by pointing out future
developments of this work.

2 Related Work 

This paper is connected to several different tracks of
research: the study of how social networks and social
media allow memes to spread (what we call "meme
ecology"); the study of social media in general; and memetics.

In the Introduction we stated that there are several works
studying internet memes. For example, (Wei et al. 2012)
studies the dynamics of competition between memes spreading
in two distinct social networks on different social media
(Facebook and Twitter). Also (Weng et al. 2012) addresses
the problem of competition among memes, assuming that
each user can follow only a handful of memes at the same
time. Meme and information spread is also a problem
addressed in computer science with different methodological
approaches: works like (Goyal, Bonchi, and Lakshmanan
2008), (Hoang and Lim 2012) and (Berlingerio, Coscia,
and Giannotti 2009) provide algorithms and frameworks
to analyze how information spreads in a network of people.
While those papers treat different information cascades
as independent, in (Myers and Leskovec 2012) different
spread events on the same network are analyzed at the
same time, as different memes influence each other while
trying to spread over the same set of minds. Other computer
science tools help us track memes over the Web (Leskovec,
Backstrom, and Kleinberg 2009). As an application, (Fowler
and Christakis 2010) deals with cooperative behavior.

We do believe that the social network analysis behind
meme spreading is interesting; however, it leaves undescribed
the fundamental characteristics of the memes themselves.
While studying a system, it is important to know how
the parts interact, but also how they function themselves, to
have a better picture of what is actually happening in the
world. Our paper tries to provide some contribution exactly
in this last aspect: we do not consider network analysis, only
the memes themselves. Closer to our work is (Bauckhage
2011), but there the author does not address the issues of
collaboration and competition in internet memes.

Internet memes spread over social media websites. In
computer science, many researchers have addressed the
problem of describing the dynamics of user behavior in
social media websites. The topics touched include the
emergence of conventions in online social networks (Kooti et
al. 2012), how to select the critical features in the amount
of data generated in these websites (Tang and Liu 2012),
the privacy aspects (Quercia et al. 2012), the "follow" and
friendship dynamics in Twitter (Pennacchiotti et al. 2012)
and (Adamic et al. 2011), and many more.

Finally, the literature also contains some publications
about memetics, the proposed branch of science that should
study memes. Some of the pioneering works are (Hofstadter
1991), (Brodie 2004) and (Lynch 1999). Our work differs
from these examples as we focus specifically on internet
memes. Our approach is more data driven, because internet
memes are more easily traceable: they leave a measurable
footprint. This makes our paper a further contribution with
respect to these works.

3 The Data 

As stated in the Introduction, social media provide many 
sources for meme analysis and we chose to focus on the 
website Quickmeme. There are other kinds of internet 
memes and different dynamics for their creation, thus we 
will only describe the meme types present in Quickmeme. 

Quickmeme works in the following way: the website
provides a system to create memes and then to create particular
implementations of any other meme. We provide a couple
of examples in Figures 1(a-b). Figure 1(a) shows an
implementation of the meme "Socially Awkward Penguin". The
meme is a picture of a left-facing penguin on a blue field.
The socially awkward penguin is used to make jokes about
anything related to social clumsiness: users use the
template to describe a social situation where they misbehaved or
did not know how to react properly. Figure 1(b) is the
"Philosoraptor" meme, depicted as a Velociraptor in
a philosophical pose on a green field. It is commonly used for
rhetorical questions and puns that sound philosophically deep.

(a) Socially Awkward Penguin (b) Philosoraptor

Figure 1: Two implementations of memes from Quickmeme.

From these examples, we can define a meme as a
combination of a picture and a tacit concept linked to the
picture. We refer to a meme with the symbol $m$. An
implementation of a meme is the picture and a particular
humorous caption added to it following the tacit concept of the
meme. An implementation of a meme $m$ is referred to as $i_m^x$,
so $m = \{i_m^1, \ldots, i_m^n\}$ is the collection of all $n$
implementations of meme $m$.

Quickmeme provides a suitable data source for our study 
for different reasons. First, users are versatile in their meme 
use: they combine memes, they make them evolve with dif- 
ferent pictures or concepts, they use one meme against oth- 
ers. For example, from the socially awkward penguin, users 
created the "Socially Awesome Penguin", the "Socially Av- 
erage Penguin" and tens more. These dynamics of evolution, 
competition and collaboration are not much different from 
the same dynamics observed in the gene pool. 

Second, the website provides a scoring system, that can 
be used as a proxy of a meme's "success" in the meme pool. 
Without the need of an account, anybody can cast a vote 
saying that a particular meme was particularly funny, rec- 
ognizable or otherwise remarkable. When voting, the users 
have three choices: "awesome" meme (that adds 2 to the to- 
tal rating of the meme), "average" meme (that adds 1) and 
"bad" meme (that removes 1 from the total rating). 

In Quickmeme, a meme implementation can be "Featured"
if it is sufficiently popular. We crawled the 499
memes with at least one featured implementation present on
the website on October 15th, 2012, restricting our crawl to
only the implementations of the memes created since
October 9th, 2011. We chose to restrict to memes with featured
implementations as they are more visible, thus they obtain
more votes and generate more data points for our study. We
downloaded 178,801 meme implementations in total.

Figure 2 shows the sum of ratings for all the memes. We
can see that, during the first weeks, Quickmeme saw a growth
in positive ratings cast. It reached its peak, in number of
ratings, during the first quarter of 2012, and then started a
slow decline.

Figure 2: Temporal distribution of meme ratings.

In Figure 3 we show how many meme implementations
$i_m^x$ obtained a particular rating score. We can identify a
distribution roughly compliant with a power law: there are
some very popular memes (the distribution spans five orders of
magnitude) while the vast majority has a low rating. There
are two noticeable deviations: a hump around 500-2,000
ratings and the exponential cutoff. While the second is
present in many distributions, the first is more interesting.
This is a "front page" effect: popular memes are usually
exposed on the front page of the website, so that they obtain
extra ratings. This explains why more memes than expected
tend to have a rating > 1,000.

Figure 3: Meme implementation rating distribution.

4 Detecting Meme Relationships 

The aim of this section is to show that, in Quickmeme, the 
evolution of popularity of a meme is not dependent only on 
the meme itself, but it is influenced by the other memes. The 
influence may take place in two different ways: by means 
of competition and by means of collaboration. Competi- 
tion is defined as a negative influence of one meme over 
another: the success in terms of ratings of one meme pro- 
vokes lower ratings in another meme. Collaboration is the 
opposite effect. The influence may happen for many reasons
(the memes are similar, so a user with an idea for a caption
must choose one of the two; one meme is used to criticize
another meme for being useless; or a new meme replaces an
old meme), but our aim is simply to demonstrate that such
influences exist.

To see whether the popularity of two memes is not
independent, we look at the weekly total ratings of a meme.
We refer to the total ratings of an implementation $i_m^x$ of
meme $m$ in the first week as $w_1(i_m^x)$. For each week, we
sum all the ratings collected by the implementations
of a meme, i.e. $w_1(m) = \sum_{x=1}^{n} w_1(i_m^x)$, and
$w(m) = \{w_1(m), w_2(m), \ldots, w_{53}(m)\}$, given that we
collected data for 53 weeks. $w(m)$ is referred to as the "meme
rating vector". If an increase in success of one meme tends to
cause a fall in success of another, and vice versa, then we say
that there is a competition between the two memes. A meme's
success is determined by looking at its rating vector.
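
The construction of the meme rating vector can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the study: the record layout (meme name, 0-based week index, rating collected by one implementation in that week) and the function name `meme_rating_vectors` are hypothetical.

```python
from collections import defaultdict

N_WEEKS = 53  # number of weeks covered by the crawl


def meme_rating_vectors(records, n_weeks=N_WEEKS):
    """Sum implementation ratings per meme and per week.

    `records` is an iterable of (meme, week, rating) tuples
    (hypothetical layout): `week` is a 0-based week index and
    `rating` is what one implementation collected in that week.
    """
    vectors = defaultdict(lambda: [0] * n_weeks)
    for meme, week, rating in records:
        vectors[meme][week] += rating
    return dict(vectors)


# Toy data: three implementations of one meme, one of another.
records = [
    ("Philosoraptor", 0, 12),
    ("Philosoraptor", 0, 5),
    ("Philosoraptor", 2, 7),
    ("Courage Wolf", 1, 3),
]
vectors = meme_rating_vectors(records, n_weeks=3)
print(vectors["Philosoraptor"])  # [17, 0, 7]
```

Each value in the returned dictionary plays the role of one vector $w(m)$.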

We discuss how we compare meme rating vectors via a
null model definition in Section 4.1. We then discuss some
examples of collaboration in Section 4.2 and we briefly
present other instances of meme competition in Section 4.3.

4.1 Null Model 

Two meme rating vectors may not be comparable for two
reasons. To understand those reasons, we depict two meme
rating vectors in Figure 4(a): in red we have the ratings
evolution of the meme "Sudden Clarity Clarence" (SCC) while
in green we have the ratings evolution of "Courage Wolf"
(CW). As we can see, SCC is, ceteris paribus, on average
more popular than CW. As a result, we would find a
competition even if there is none, as CW always has low ratings
when SCC has high ratings, regardless of their interactions.

The second reason is that two competing memes may rise
together only because they became more popular for
independent reasons or because the Quickmeme website has a
spike in its user traffic. In other words, fluctuations in
ratings may be explained by meme and weekly website popularity,
without necessarily meaning that the two memes are in
competition or in collaboration. A null model will help us
eliminate these meaningless fluctuations and generate a
normalized meme popularity vector, as depicted in Figure 4(b).

The overall plan of action is the following: 

• We generate a null model; 

• We generate normalized meme rating vectors by
comparing the observed meme rating vectors with the
expectations obtained from the null model;

• We compare the normalized meme rating vectors to spot
influences in meme popularity.

Null Model Generation. In the null model we want to
assume there are no collaboration/competition relations, but
we want to control for each meme's general popularity and
the website popularity in a given week. This will be the
baseline against which the observed data can be compared. To
achieve this, we borrow a method from the ecology
literature (Gotelli and Graves 2006; Harvey et al. 1983). We
do this because our problem has some similarities with the
ecology models of species and ecosystems: memes are like
animal species and their total ratings in one week are a
quantitative measure of success of the species. The ecosystem of
the species is represented by the users connected to the
website during that particular week.

First, we store all meme rating vectors $w(m)$ in a meme
matrix $M_{m,w}$. The meme rating vectors are the rows of the
matrix, while the weeks are the columns. For example, the
entry $M_{m,w}(i, j)$ contains the ratings of meme $m_i$ for the
week $w_j$, or $w_j(m_i)$. Next, we create a randomized null
version of $M_{m,w}$ in which each entry $w_j(m_i)$ is random, but we
preserve the total sums of the rows (memes) and the columns
(weeks) of the matrix. By testing the ratio between the
observed values and the expected values according to this null
model we quantify how much a meme is unusually (un)popular,
given its general popularity and given the popularity of
Quickmeme that week.
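
To make the idea of a margin-preserving randomization concrete, the sketch below draws one random table with prescribed row and column totals. It is a simple permutation-based stand-in written for illustration, not the Patefield algorithm itself, and it assumes non-negative integer totals; the function name is hypothetical.

```python
import random


def random_table_fixed_margins(row_sums, col_sums, rng=random):
    """Draw one random non-negative integer table with the given margins.

    Every rating unit gets a row label (its meme) and a column label
    (its week); shuffling the column labels destroys any association
    between meme and week while keeping all totals unchanged.
    """
    if sum(row_sums) != sum(col_sums):
        raise ValueError("row and column totals must match")
    row_labels = [i for i, total in enumerate(row_sums) for _ in range(total)]
    col_labels = [j for j, total in enumerate(col_sums) for _ in range(total)]
    rng.shuffle(col_labels)
    table = [[0] * len(col_sums) for _ in row_sums]
    for i, j in zip(row_labels, col_labels):
        table[i][j] += 1
    return table


# Toy observed matrix: 2 memes (rows) x 3 weeks (columns).
observed = [[8, 1, 1], [0, 5, 5]]
row_sums = [sum(row) for row in observed]
col_sums = [sum(col) for col in zip(*observed)]
null = random_table_fixed_margins(row_sums, col_sums, random.Random(0))
```

Whatever the random draw, the row and column sums of `null` equal those of `observed`, which is the property the null model relies on.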

We generated 100 such null matrices using the vegan
package for the R statistical software, which implements the
Patefield algorithm (Patefield 1981). From our 100 null
matrices we extracted three different versions of a null
matrix. The first is $N_{m,w}$, the master null model, generated
by averaging the cells of all 100 null matrices. Each cell
$n_j(m_i) \in N_{m,w}$ is calculated using the following equation:

$$n_j(m_i) = \frac{1}{100} \sum_{k=1}^{100} w_j^k(m_i),$$

where $w_j^k(m_i)$ is the value of cell $(i, j)$ in the $k$-th
null matrix. The second and the third are respectively
$N^1_{m,w}$ and $N^2_{m,w}$, constructed exactly like $N_{m,w}$,
except that for $N^1_{m,w}$ we use the first 50 random matrices
and for $N^2_{m,w}$ we use the remaining 50.
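
The three averaged null matrices can be obtained with a cell-wise mean. The following sketch assumes the random matrices are available as a list of equally shaped lists of lists; the helper names `average_tables` and `build_null_models` are hypothetical.

```python
def average_tables(tables):
    """Cell-wise mean of equally shaped tables (lists of lists)."""
    count = len(tables)
    n_rows, n_cols = len(tables[0]), len(tables[0][0])
    return [
        [sum(t[i][j] for t in tables) / count for j in range(n_cols)]
        for i in range(n_rows)
    ]


def build_null_models(null_tables):
    """Return (master, first-half, second-half) averaged null matrices."""
    half = len(null_tables) // 2
    master = average_tables(null_tables)
    first = average_tables(null_tables[:half])
    second = average_tables(null_tables[half:])
    return master, first, second


# Toy example with 4 random tables instead of 100.
tables = [
    [[2, 0], [0, 2]],
    [[2, 0], [0, 2]],
    [[0, 2], [2, 0]],
    [[0, 2], [2, 0]],
]
master, first, second = build_null_models(tables)
```

With 100 input tables, `master` corresponds to $N_{m,w}$ and the two halves to $N^1_{m,w}$ and $N^2_{m,w}$.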

Normalized Meme Rating Vectors. We are now able to
perform our data cleaning, to generate the normalized meme
rating vectors that we have depicted in Figure 4(b). Each
observed cell value $w_j(m_i) \in M_{m,w}$ is tested against its
corresponding null model cell $n_j(m_i) \in N_{m,w}$. We then
calculate each normalized value $\sigma_j(m_i)$ as follows:

$$\sigma_j(m_i) = \begin{cases}
1 & \text{if } w_j(m_i) = 0 \wedge n_j(m_i) = 0 \\
w_j(m_i) & \text{if } 0 < n_j(m_i) < 1 \\
\dfrac{w_j(m_i)}{n_j(m_i)} & \text{otherwise.}
\end{cases}$$

What $\sigma_j(m_i)$ captures is the ratio between the
observed meme rating $w_j(m_i)$ and the expected meme rating
$n_j(m_i)$ given the website popularity in week $j$ and the
overall popularity of meme $i$. When we neither observe nor
expect any rating, the ratio is undefined ($\frac{0}{0}$), but we
are actually observing as much as we expect, therefore we set
$\sigma_j(m_i) = 1$. When we expect less than one rating, i.e.
$n_j(m_i) < 1$, we treat it as if we expected exactly one
rating, or $n_j(m_i) = 1$, otherwise $\sigma_j(m_i)$ would tend to
$\infty$ even for small $w_j(m_i)$.
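
The three cases of the normalization translate directly into a small function. This is a sketch under the reading of the rule given in the text; the function name is hypothetical, and an expected value of exactly zero with a non-zero observation (which should not arise when the margins are preserved) is treated here like the "less than one expected rating" case, which is an assumption.

```python
def normalized_rating(observed, expected):
    """Ratio of observed to expected rating, with the two special cases.

    - nothing observed and nothing expected: ratio set to 1
    - fewer than one rating expected: treat the expectation as 1
    - otherwise: plain ratio observed / expected
    """
    if observed == 0 and expected == 0:
        return 1.0
    if expected < 1:
        return float(observed)
    return observed / expected


print(normalized_rating(0, 0))    # 1.0
print(normalized_rating(3, 0.5))  # 3.0
print(normalized_rating(6, 2))    # 3.0
print(normalized_rating(1, 4))    # 0.25
```

Values above 1 mark weeks in which a meme did better than expected, values below 1 weeks in which it did worse.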

In Figure 4(b) we represent the normalized meme rating
vectors of the two meme rating vectors depicted in Figure
4(a). As we can see, the two vectors can now be compared.

Rating Vectors Comparison. We are interested in
understanding whether the rise of one meme makes the fall of the
other meme more likely. For this reason, we calculate a set of
conditional probabilities.

(a) Timeline of ratings (b) Ratings vs null model

Figure 4: Temporal evolution of ratings and expected ratings of two competing memes.

Each meme $m$ has a given probability of having $\sigma(m) > 1$,
i.e. it is more popular than expected; and $\sigma(m) < 1$, i.e. it
is less popular than expected. We refer to these probabilities
as $p_m(\sigma(m) > 1)$ and $p_m(\sigma(m) < 1)$. We systematically
calculate eight conditional probabilities for each pair of
memes $m_i$ and $m_j$:

• $p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) > 1)$;

• $p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) < 1)$;

• $p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) > 1)$;

• $p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) < 1)$;

and the reverse conditional probabilities given $\sigma(m_i)$.
$p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) > 1)$ records the probability
of meme $m_i$ being more popular than expected given that
we observed that meme $m_j$ is more popular than expected;
$p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) < 1)$ is the probability of
meme $m_i$ being more popular than expected given that meme
$m_j$ is less popular than expected; and so on.

Then, we say that $m_i$ and $m_j$ are in competition if all the
following inequalities hold true:

• $p_{m_i}(\sigma(m_i) > 1) < p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) < 1)$;

• $p_{m_i}(\sigma(m_i) < 1) < p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) > 1)$;

• $p_{m_j}(\sigma(m_j) > 1) < p_{m_j}(\sigma(m_j) > 1 \mid \sigma(m_i) < 1)$;

• $p_{m_j}(\sigma(m_j) < 1) < p_{m_j}(\sigma(m_j) < 1 \mid \sigma(m_i) > 1)$.

On the other hand, $m_i$ and $m_j$ are in collaboration if all
the following inequalities hold true:

• $p_{m_i}(\sigma(m_i) > 1) < p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) > 1)$;

• $p_{m_i}(\sigma(m_i) < 1) < p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) < 1)$;

• $p_{m_j}(\sigma(m_j) > 1) < p_{m_j}(\sigma(m_j) > 1 \mid \sigma(m_i) > 1)$;

• $p_{m_j}(\sigma(m_j) < 1) < p_{m_j}(\sigma(m_j) < 1 \mid \sigma(m_i) < 1)$.



So, if the conditional probability of $m_i$ being more (or
less) successful given that $m_j$ was less (or more) successful
is higher than the independent probability of $m_i$ being
more (or less) successful, and vice versa, then $m_i$ and $m_j$
are said to be in "competition". In the opposite case, when
the conditional probability of $m_i$ being more (or less)
successful given that $m_j$ was more (or less) successful is higher
than the independent probability of $m_i$ being more (or less)
successful, and vice versa, then $m_i$ and $m_j$ are said to be in
"collaboration".

We discard competition and collaboration relationships
if the two memes were not present together in the Quickmeme
website for more than 25 weeks. We also verify whether the two
memes are expected to generate a competition or collaboration
relationship just because of their popularity and the
Quickmeme website popularity. To do so, we use one null matrix
as the "observed" meme ratings and another null matrix as
the "expected" values. If two memes rise and fall together
in the null matrix, then an observed "collaboration" between
them may be due solely to external factors. So, if they appear
to "collaborate", they are in fact only behaving as expected.

For this check we make use of the $N^1_{m,w}$ and $N^2_{m,w}$ null
matrices. We calculate the $\sigma_j(m_i)$ values for the
$N^1_{m,w}$ matrix with the same procedure described above, using
the $N^2_{m,w}$ matrix as the null model. Then, we calculate the
probability values. If we find a competition or collaboration
relationship between any two memes in this procedure, we
remove that pair of memes from the possible competing
or collaborating memes.

Overall, the result of this procedure is a matrix $C_{m,m}$.
In $C_{m,m}$ both rows and columns are memes. Each entry of
$C_{m,m}$ can take three possible values: 1 if there is
collaboration between the two memes, -1 if there is competition
and 0 if the two memes are independent from each other.
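
The inequality tests that decide each entry of the relationship matrix can be sketched as follows. This is an illustrative reading of the definitions above, not the original implementation: the function names are hypothetical, weeks with a normalized value of exactly 1 count as neither above nor below expectation, a conditional probability whose conditioning event never occurs is treated as failing the test, and the 25-week co-presence filter and the null-matrix check are left out.

```python
def prob(flags):
    """Fraction of True values; None if there is nothing to count."""
    return sum(flags) / len(flags) if flags else None


def cond_prob(event, given):
    """P(event | given), estimated over the weeks where `given` holds."""
    return prob([e for e, g in zip(event, given) if g])


def relationship(sigma_i, sigma_j):
    """Return 1 (collaboration), -1 (competition) or 0 (independent)."""
    hi_i, lo_i = [s > 1 for s in sigma_i], [s < 1 for s in sigma_i]
    hi_j, lo_j = [s > 1 for s in sigma_j], [s < 1 for s in sigma_j]

    def all_higher(pairs):
        # Each pair is (marginal probability, conditional probability).
        return all(c is not None and m < c for m, c in pairs)

    collaboration = [
        (prob(hi_i), cond_prob(hi_i, hi_j)),
        (prob(lo_i), cond_prob(lo_i, lo_j)),
        (prob(hi_j), cond_prob(hi_j, hi_i)),
        (prob(lo_j), cond_prob(lo_j, lo_i)),
    ]
    competition = [
        (prob(hi_i), cond_prob(hi_i, lo_j)),
        (prob(lo_i), cond_prob(lo_i, hi_j)),
        (prob(hi_j), cond_prob(hi_j, lo_i)),
        (prob(lo_j), cond_prob(lo_j, hi_i)),
    ]
    if all_higher(collaboration):
        return 1
    if all_higher(competition):
        return -1
    return 0


# Two memes that are over-performing in opposite weeks.
print(relationship([2, 2, 0.5, 0.5], [0.5, 0.5, 2, 2]))  # -1
```

Applying `relationship` to every pair of normalized rating vectors would fill the off-diagonal entries of the matrix.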

4.2 Meme Collaboration 

The $C_{m,m}$ matrix is a square 499 x 499 matrix. Given that
the relations are reciprocal, the $C_{m,m}$ matrix is symmetric,
and thus there are $\frac{499 \times 498}{2} = 124{,}251$ possible
relationships between memes. Ten of the collaborations and
competitions found were also found in the null matrices, thus
they were eliminated from $C_{m,m}$. In the end, 18,744
relationships (15.08%) are equal to -1, while 38,619 relationships
(31.08%) are positive. Thus, on average, each meme has a
popularity anti-correlation (competition) with
$\frac{18{,}744 \times 2}{499} = 75.13$ memes and it has a popularity
correlation (collaboration) with
$\frac{38{,}619 \times 2}{499} = 154.79$ memes. From this fact we can
deduce that, among the memes in Quickmeme.com, collaboration
is more common than competition. This is expected,
as all the memes coexist in the same website and it is very
easy to give a positive rating to every meme. For this reason,
we start by considering the collaboration phenomena.

Meme                  $k_1$   $k_{-1}$   $|m|$
College Freshman      308     41         974
Jackass Boyfriend     306     34         81
Tech Impaired Duck    304     32         555
All The Things        304     33         866
I Got This            296     42         107
Hipster Dog           36      370        208
Scumbag Reddit        64      298        820
Lazy Bachelor Bear    65      280        39
Guido Jesus           90      270        211
Scumbag Parents       88      268        675

Table 1: The top five memes in collaboration (top half)
and competition (bottom half).

We report in Table 1 the top 5 memes with positive (top
half of the table) and negative (bottom half of the table)
popularity correlations. Column $k_1$ reports the number of
memes correlated with meme $m$, column $k_{-1}$ the number
of memes anti-correlated with $m$, and $|m|$ is the number of
meme implementations. The average popularity, in number
of meme implementations $|m|$, of the top meme collaborators
is higher than the average popularity of the top meme
competitors. Our idea is that competition generates clusters
of correlated memes, which we can call "meme organisms".

To understand if there really are meme organisms, we
need to understand if large groups of memes tend to correlate
with each other. This intuition corresponds to performing
clustering on the $C_{m,m}$ matrix. To this end, we remove from it
all the -1 entries, which we consider as 0. Then, we perform a
hierarchical clustering using the complete linkage method
to calculate how distant a pair of memes, or a pair of
meme clusters, are from each other. In complete linkage, the
distance between two clusters is computed as the maximum
distance between a pair of objects, one in one cluster and
one in the other. By being very demanding on the similarity
of memes, we make sure that we group in each cluster only
very similar memes.
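
A naive version of complete-linkage clustering with a cut height can be written in plain Python to show the mechanics. This sketch is not the R routine used for the analysis, and it rests on one assumption that the text does not spell out: the distance between two memes is taken as 0 when they collaborate and 1 otherwise. Function names are hypothetical.

```python
def distance_matrix(c):
    """Distances from relationship matrix C: -1 entries count as 0."""
    n = len(c)
    return [
        [0 if i == j else 1 - max(c[i][j], 0) for j in range(n)]
        for i in range(n)
    ]


def complete_linkage_clusters(dist, cut_height):
    """Merge clusters bottom-up, stopping at the given cut height."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: the farthest pair decides the distance.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        d, x, y = min(
            ((linkage(a, b), x, y)
             for x, a in enumerate(clusters)
             for y, b in enumerate(clusters) if x < y),
            key=lambda item: item[0],
        )
        if d > cut_height:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Memes 0-1 and 2-3 collaborate; memes 0 and 2 compete.
C = [
    [0, 1, -1, 0],
    [1, 0, 0, 0],
    [-1, 0, 0, 1],
    [0, 0, 1, 0],
]
print(complete_linkage_clusters(distance_matrix(C), cut_height=0.5))
# [[0, 1], [2, 3]]
```

Because the farthest pair decides whether two clusters merge, a single non-collaborating pair is enough to keep two groups apart, which is the "demanding" behavior described above.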

We report in Figure 5 the dendrogram of the resulting
clustering of the $C_{m,m}$ matrix, computed with the hclust
function in the R statistical software. As we can see, the
cluster structure of the $C_{m,m}$ matrix shows us that the
matrix itself has a modular structure, as there are well defined
clusters. To obtain the clusters we need to define a distance
threshold below which all memes are grouped. Different
thresholds can be chosen and we leave as future work the
task of defining the best one to obtain the meme organisms.
Here, we chose an arbitrary reasonable cut height and we
call "organisms" the resulting meme clusters.

Figure 5: The result of the hierarchical clustering of the
$C_{m,m}$ matrix. For clarity, we show only the top part of the
dendrogram.

(a) Chemistry Cat (b) Dwight

Figure 6: The meme cluster #45.

As an illustration, we present an example of one meme
organism. For clarity, out of the 103 clusters (composed of
4.8 memes on average) we chose to show cluster #45,
composed of two memes. The two memes are "Chemistry Cat"
and "Dwight", and we show an implementation of both
in Figure 6. Chemistry Cat is a meme used to make puns
using scientific concepts, while Dwight is a character of a
popular TV show, who is used to make pedantic remarks about
common knowledge. The two memes are intuitively part
of closely related geeky humor.

In Table 2 we show the evidence on which we conclude
that the two memes are collaborating: their independent and
conditional probabilities of being successful and
unsuccessful. Chemistry Cat is successful in 20.75% of the weeks, but
if Dwight is also successful its chances rise to 60%, while if
Dwight is unsuccessful its chances drop to 11.63%. The chances
of being unsuccessful rise when Dwight is unsuccessful.
The inequalities also hold for the Dwight meme.



Probability                                        Chemistry Cat   Dwight
$p_{m_i}(\sigma(m_i) > 1)$                         20.75%          18.87%
$p_{m_i}(\sigma(m_i) < 1)$                         79.25%          81.13%
$p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) > 1)$    60.00%          54.55%
$p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) < 1)$    11.63%          9.52%
$p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) > 1)$    40.00%          45.45%
$p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) < 1)$    88.37%          90.48%

Table 2: The probabilities of the memes in cluster #45 of
being successful and unsuccessful.
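
As a sanity check on Table 2, the reported percentages are consistent with simple week counts over the 53 observed weeks. The counts below are inferred here from the percentages and are not reported as such, so they should be read as one consistent reconstruction (CC stands for Chemistry Cat, D for Dwight):

```latex
\begin{aligned}
p(\sigma(\mathrm{CC}) > 1) &= 11/53 \approx 20.75\%, &
p(\sigma(\mathrm{D}) > 1) &= 10/53 \approx 18.87\%, \\
p(\sigma(\mathrm{CC}) > 1 \mid \sigma(\mathrm{D}) > 1) &= 6/10 = 60.00\%, &
p(\sigma(\mathrm{D}) > 1 \mid \sigma(\mathrm{CC}) > 1) &= 6/11 \approx 54.55\%, \\
p(\sigma(\mathrm{CC}) > 1 \mid \sigma(\mathrm{D}) < 1) &= 5/43 \approx 11.63\%, &
p(\sigma(\mathrm{D}) > 1 \mid \sigma(\mathrm{CC}) < 1) &= 4/42 \approx 9.52\%, \\
p(\sigma(\mathrm{CC}) < 1 \mid \sigma(\mathrm{D}) < 1) &= 38/43 \approx 88.37\%, &
p(\sigma(\mathrm{D}) < 1 \mid \sigma(\mathrm{CC}) < 1) &= 38/42 \approx 90.48\%.
\end{aligned}
```

Under this reading, both memes are above expectation in 6 weeks, only Chemistry Cat in 5, only Dwight in 4, and neither in 38, for a total of 53 weeks.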











(a) Against "First World Problems" (b) Against "Hipster"

Figure 7: Two meme implementations of the Scumbag Reddit
meme.



4.3 Meme Competition 

While less widespread than collaboration, there is also com- 
petition in the meme pool: some memes have a dispropor- 
tionate amount of anti-correlations with other memes. 

In this case, one of the most evident explanations can be
found in the way some users use the meme itself. An
example of this is reported in Figure 7. Figure 7 reports two
implementations of the meme "Scumbag Reddit" (SR), a
meme used for self-critique of many widespread user
behaviors on the popular social bookmarking website Reddit.com
(whose users make heavy use of Quickmeme.com). As we
saw in Table 1, the SR meme is the second most competitive
meme. From Figure 7 we can understand why: users often
adopt this meme to state something about other memes (both
implementations refer to other memes: Figure 7(a) refers to
"First World Problems" memes and Figure 7(b) refers to
"Hipster" memes).

To prove this point, we report in Table [3] the conditional 
probabilities of SR meme with a First World Problem (FWP) 
meme and one Hipster meme. The fact that SR meme is pop- 
ular negatively influences the success odds of the other two 
memes and viceversa. 

5 Meme Success 

In this section we study the characteristics of successful and 
unsuccessful memes, aiming at a description of the char- 
acteristics that are coiTelated with meme success. First, we 
need to define the set of meme features we want to study. 



Probability                                        Scumbag Reddit    FWP Cat
$p_{m_i}(\sigma(m_i) > 1)$                         26.42%            11.32%
$p_{m_i}(\sigma(m_i) < 1)$                         73.58%            88.68%
$p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) > 1)$    16.66%             7.14%
$p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) < 1)$    27.66%            12.82%
$p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) > 1)$    83.33%            92.86%
$p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) < 1)$    72.34%            87.18%

Probability                                        Scumbag Reddit    Hipster Barista
$p_{m_i}(\sigma(m_i) > 1)$                         26.42%            39.62%
$p_{m_i}(\sigma(m_i) < 1)$                         73.58%            60.38%
$p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) > 1)$     9.52%            14.29%
$p_{m_i}(\sigma(m_i) > 1 \mid \sigma(m_j) < 1)$    37.50%            48.72%
$p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) > 1)$    90.48%            85.71%
$p_{m_i}(\sigma(m_i) < 1 \mid \sigma(m_j) < 1)$    62.50%            51.28%



Table 3: The probabilities of the Scumbag Reddit meme being
successful and unsuccessful given an example of First World
Problems and Hipster memes.



Then, we establish a criterion to determine whether a meme is
successful or unsuccessful. Finally, we use a decision tree
algorithm to describe how the features raise or lower the
probability that a meme is successful.

5.1 Meme Features 

We decided to focus on four main features of the memes.
The features are: the number of collaborators, the number of
competitors, whether the meme is in a meme organism, and the
size of its popularity peak relative to its average popularity.

The number of collaborators and the number of competitors
count how many memes are recorded in collaboration and in
competition with a given meme, according to the procedure
described in Section 4.1. Instead of taking the absolute
number, which may generate a decision tree too deep and
difficult to interpret, we decided to create three bins for each
of these two features. Memes are classified as highly
competitive/collaborative, average competitive/collaborative or
lowly competitive/collaborative. We defined the thresholds
to classify a meme into one of these three categories in
order to have balanced, i.e. equally populated, bins for high
and low competitors/collaborators.

Figures 8(a-b) depict the distributions for each meme of
the number of competitors and collaborators, respectively.
The black lines indicate our thresholds for the bins. We also
sum up the values of the thresholds and how many memes
fell into each bin in Table 4. To be highly competitive,
a meme needs to have more than 77 competitors (and
172 memes satisfy this constraint), while to be lowly
competitive it is required to have fewer than 50 competitors,
with 174 memes falling into this bin.
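The three-way binning can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `three_way_bin` and the bin labels are hypothetical, only the competition thresholds (50 and 77) come from the text, and counts exactly equal to a threshold are assumed to fall into the middle bin, since the text requires "more than 77" and "less than 50" for the outer bins.

```python
def three_way_bin(count: int, low: int, high: int) -> str:
    """Classify a count into one of three bins using two thresholds.

    Counts above `high` are "> AVG", counts below `low` are "< AVG",
    and everything in between (boundaries included, by assumption)
    is "= AVG".
    """
    if count > high:
        return "> AVG"
    if count < low:
        return "< AVG"
    return "= AVG"


# Competition thresholds reported in the text.
COMPETITION_LOW, COMPETITION_HIGH = 50, 77

# Hypothetical competitor counts, for illustration only.
for competitors in (100, 60, 48):
    label = three_way_bin(competitors, COMPETITION_LOW, COMPETITION_HIGH)
    print(competitors, label)
```

The same function would apply to the collaboration feature with its own pair of thresholds.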

The third feature records whether a meme is part of a
meme organism or not. We introduced the general
methodology to detect meme organisms in Section 4.2. Here, we
make use of the organisms extracted in that section. To be
in an organism, a meme is required to be present in a cluster
with at least two other memes, given our definition of
organisms in Section 4.2. Table 4 reports the number of memes
that are in an organism and the number that are not.
Figure 8: Distributions per meme.



Feature                     > AVG      = AVG            < AVG
Competition    Threshold    x > 77     50 < x < 77      x < 50
               # Memes      172        153              174
Collaboration  Threshold    x > 200    140 < x < 200    x < 140
               # Memes      195        116              188
In Organism    Threshold    True       N/A              False
               # Memes      336        N/A              163
Peak           Threshold    x > 25     N/A              x < 25
               # Memes      231        N/A              268
Successful     Threshold    True       N/A              False
               # Memes      177        N/A              322



Table 4: The thresholds and number of memes for our fea- 
ture bins. 



The last feature is the relative height of the popularity
peak of a meme over its average popularity. Only a handful
of memes are constantly popular on Quickmeme. Often,
memes have popularity peaks. These peaks do not necessarily
happen when a meme is born, but when a user creates
a very successful meme implementation of it and this
implementation hits the front page. Then, many users produce
all possible variants of this meme implementation in a
limited time span and many of these usually hit the front page
too. At this point, either the meme stays popular or it fades
back into oblivion. Figure 9 depicts the meme rating
vectors of three different memes. "Bad Luck Brian" (BLB) has
a peak several weeks after its creation and it manages to stay
somewhat popular after its large peak. "Ridiculously
Photogenic Guy" (RPG), instead, has an immediate peak larger
than BLB's, but it then fades into oblivion as an ephemeral
trend. Finally, "Futurama Fry" (FF) has no large popularity
peak at all, but it lives on as a successful meme.

To calculate the last feature we take the number of ratings
a meme got in its most popular week and we divide it by

^It is the so-called "Karma train" and there is a 
Figure 9: Some meme rating vectors with different peaks.



the average weekly rating of that meme. As we did for the
collaborator and competitor features, we create equipopulated
bins for this feature. In this case, we create two bins: "Above
average" for memes whose popularity peak is at least 25 times
higher than their average popularity, and "Below
average" if the popularity peak is lower than 25 times the
average popularity. The peak values for the memes reported
in Figure 9 are 9.6 for BLB, 41.44 for RPG and 2.36 for FF.
We report the number of memes in both bins in Table 4.
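The peak feature can be sketched in a few lines. This is an illustrative sketch rather than the original code: the function names `peak_ratio` and `peak_bin` are assumptions, and the weekly series passed to `peak_ratio` is made up for the example. Only the threshold of 25 and the three peak values for BLB, RPG and FF come from the text.

```python
from statistics import mean


def peak_ratio(weekly_ratings):
    """Ratings in the most popular week divided by the average weekly rating."""
    average = mean(weekly_ratings)
    if average == 0:
        raise ValueError("average weekly rating is zero")
    return max(weekly_ratings) / average


def peak_bin(ratio, threshold=25.0):
    """Two-way bin: a peak of at least `threshold` times the average is above average."""
    return "Above average" if ratio >= threshold else "Below average"


# Hypothetical weekly rating vector, for illustration only.
toy_series = [10, 10, 10, 370]
print(peak_ratio(toy_series))

# Peak values reported in the text for the memes of Figure 9.
for name, ratio in [("BLB", 9.6), ("RPG", 41.44), ("FF", 2.36)]:
    print(name, peak_bin(ratio))
```

With the reported values, only RPG falls into the "Above average" bin, which matches its description as an ephemeral trend with a large immediate peak.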

5.2 Measuring Success 

There are many ways to measure how successful a
meme is. Since we already used a proxy of the number of
ratings a meme gets as one of the meme features (the
popularity peak), we cannot use the total number of ratings of
a meme as a measure. For this reason, we look instead at
the number of implementations a meme gets: the more
implementations, the more we can say that a meme is
persistent in people's minds. For each meme $m_i$ (the set of all of
its implementations) we check $|m_i|$: if it is higher than the
Figure 10: Distribution of the number of submissions per
meme.



average $|m|$ then we consider the meme successful,
otherwise it is unsuccessful. Since we have 178,801 meme
implementations distributed in 499 memes (see Section 3), our
threshold equals 358 implementations. Figure 10 depicts
the distribution of the number of implementations per meme
(the black line represents our threshold), and Table 4 reports
how many memes are (un)successful.
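The arithmetic behind the threshold can be checked directly. The sketch below is only an illustration: the constant and function names are assumptions, while the two totals come from the text. The exact average is slightly above 358, so the sketch compares against the unrounded value.

```python
TOTAL_IMPLEMENTATIONS = 178_801  # meme implementations in the dataset
TOTAL_MEMES = 499                # memes in the dataset

# Average number of implementations per meme (about 358.3).
AVERAGE_IMPLEMENTATIONS = TOTAL_IMPLEMENTATIONS / TOTAL_MEMES


def is_successful(n_implementations: int) -> bool:
    """A meme is successful if it has more implementations than the average."""
    return n_implementations > AVERAGE_IMPLEMENTATIONS


print(int(AVERAGE_IMPLEMENTATIONS))
print(is_successful(359), is_successful(358))
```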

As a result, each meme is now described as a list of
features. For example, the "Socially Awkward Penguin" (SAP) meme
presented in Section 3 has 1,413 meme implementations,
making it a successful meme. SAP also has 48 competitors
and 285 collaborators, a popularity peak 3.91 times its
average popularity, and it is in an organism with 11 memes.
Therefore, according to Table 4, its record is
{Competition: < AVG, Collaboration: > AVG,
InOrganism: True, Peak: < AVG, Successful: True}.

5.3 Extracting and Interpreting the Decision Tree 

Each meme generates a record with the procedure described
in the previous sections. The records are then used as
input to a decision tree algorithm. As the decision tree's
target variable we used "Success". We used the decision tree
implementation described in (Borgelt and Timm 2000). In
(Borgelt and Timm 2000), the tree can be pruned if it
contains too many levels. We also decided to merge some leaves
of the tree to facilitate its interpretation.



Our final decision tree is depicted in Figure 11. In Figure
11, the nodes of our tree contain the share of memes in that
particular node that are successful. In the root of the tree
(the top node in the upper part of the figure), we report the
baseline probability of a meme being successful. As we see
in Table 4, there are 177 successful memes in our dataset.
Therefore, the root node reports
$\frac{177}{499} \times 100 = 35.47\%$.



Then, each node is split into two or more branches,
according to the feature that best separates successful from
unsuccessful memes. The most important feature is "Peak": if
there was a high popularity peak, then the odds of being
successful drop to 13.41%. Memes with low popularity peaks
have a success probability of 54.47%.
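To illustrate what "best separates" can mean, the sketch below computes the information gain of the "Peak" split. Several things here are assumptions rather than facts from the paper: information gain is one common selection measure and may not be the one used by the cited implementation, the function names are hypothetical, and the per-branch success counts (31 of 231 and 146 of 268) are inferred from the reported percentages and the bin sizes in Table 4. These inferred counts sum to the 177 successful memes of the root node.

```python
from math import log2


def entropy(successes: int, total: int) -> float:
    """Binary entropy (in bits) of a node with `successes` out of `total`."""
    p = successes / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(parent, children) -> float:
    """Entropy reduction obtained by splitting `parent` into `children`.

    Every node is a (successes, total) pair.
    """
    parent_successes, parent_total = parent
    weighted = sum(
        total / parent_total * entropy(successes, total)
        for successes, total in children
    )
    return entropy(parent_successes, parent_total) - weighted


root = (177, 499)       # successful memes and all memes, from Table 4
high_peak = (31, 231)   # inferred: about 13.4% of the 231 high-peak memes
low_peak = (146, 268)   # inferred: about 54.5% of the 268 low-peak memes

# Sanity check: the two branches account for every successful meme.
assert high_peak[0] + low_peak[0] == root[0]

print(round(information_gain(root, [high_peak, low_peak]), 3))
```

A tree learner of this kind would compute such a score for every candidate feature and split on the one with the highest value.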



In the two resulting branches we can observe an interesting
fact. For memes with high popularity peaks, the number of
competitors is negatively correlated with the odds of being
successful: a large or average number of competitors makes the
success odds drop to 6.25% and 12.65%, respectively, while a
small number of competitors makes the meme slightly more
likely to be successful (17.3%). On the other branch, i.e. for
memes without high popularity peaks, the number of competitors
is actually positively correlated with success odds. Highly
competitive memes have a 75% chance of being successful, more
than memes with an average (36.48%) or low (38.02%) number of
competitors. A possible explanation of this fact may be the
following: if there is a popularity peak, then a meme will be
used frequently only if it is not too similar to many other
memes. If, instead, there is no popularity peak, users are
likely to stick with the meme, which keeps being used together
with other memes, thus competing with them.

If a meme has no peak and a low number of competitors, a
higher than average number of collaborators correlates with its
success odds (we merged the two bins because the
distinction between them was not significant).

On the other hand, if a meme has no peak and a high
number of competitors, the best correlation is registered with
its being in a coherent meme organism. Memes in this situation
have increased odds of being successful (equal to 80.3%).
If the meme is not in an organism, having a high number
of collaborators still highly correlates with the success rate
(79.31%), while an average or lower than average number
of collaborators decreases the success odds to 58.62%.

To sum up: competition is anti-correlated with the odds of
being successful if a meme also happens to have experienced
popularity peaks. If it did not, competition is a good thing
only if the meme is part of a meme organism or can at least
count on many collaborations with other memes. Being a
collaborative meme correlates with success.

6 Conclusion and Future Works 

In this paper we presented an empirical approach to the
study of memes, by analyzing a controlled dataset focusing
on a particular class of internet memes. As opposed to
the main approach in the literature, which studies how the
characteristics of social networks favor the spread of memes, we
proposed a perspective without the use of network effects,
as we think that studying the characteristics of memes can
provide useful insights on cultural patterns, as opposed to
the social patterns studied with networks.

We studied the behavior of internet memes on the website
Quickmeme.com. We proved that on Quickmeme there
are actual memes, as they compete and collaborate,
sometimes clustering in larger ensembles. We showed that
different meme characteristics are associated with increased or
decreased odds of the meme being popular.

Our work paves the way to a number of future
developments. First, the network approach is not necessarily
mutually exclusive with our work. Combining studies of meme
competition in networks with our approach may provide
useful insights about meme dynamics. Also, our approach
to analyzing the temporal evolution of memes is still not
perfect: for example, it considers each snapshot in a meme's


Figure 11: The decision tree describing the success odds given the meme characteristics.



timeline as equally important, whereas a more dynamic
approach, such as the one presented in (Berlingerio et al. 2010),
can split it into different eras with distinct characteristics.

We believe that this work can be part of an increased
understanding of how memes work, with the hope of
shedding more light on the complex dynamics of human
cultural patterns.